Keyword and keyphrase extraction is an important problem in natural language processing, with applications ranging from summarization to semantic search to document clustering. Graph-based approaches to keyword and keyphrase extraction avoid the problem of acquiring a large in-domain training corpus by applying variants of the PageRank algorithm to a network of words. Although graph-based approaches are knowledge-lean and easily adoptable in online systems, it remains largely open whether they can benefit from centrality measures other than PageRank. In this paper, we experiment with an array of centrality measures on word and noun phrase collocation networks, and analyze their performance on four benchmark datasets. Some centrality measures not only perform as well as or better than PageRank but are also much simpler (e.g., degree, strength, and neighborhood size). Furthermore, centrality-based methods give results that are competitive with and, in some cases, better than two strong unsupervised baselines.
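To make the simpler measures concrete, the following is a minimal sketch of degree and strength on a word collocation network. It is an illustration under stated assumptions, not the construction used in the paper: the helper names (`build_collocation_graph`, `degree`, `strength`, `top_k`), the `window` parameter, whitespace tokenization, and the toy sentence are all hypothetical. Here, degree is taken to be the number of distinct neighbors of a word, and strength the sum of its edge weights (co-occurrence counts).

```python
from collections import defaultdict


def build_collocation_graph(tokens, window=1):
    """Build an undirected weighted graph of word co-occurrences.

    Assumption: two words are linked when they appear within `window`
    positions of each other; the edge weight counts such co-occurrences.
    """
    graph = defaultdict(lambda: defaultdict(int))
    for i, word in enumerate(tokens):
        # Look ahead at most `window` tokens from position i.
        for j in range(i + 1, min(i + 1 + window, len(tokens))):
            other = tokens[j]
            if other == word:
                continue  # skip self-loops
            graph[word][other] += 1
            graph[other][word] += 1
    return graph


def degree(graph):
    """Degree centrality: number of distinct neighbors per word."""
    return {word: len(neighbors) for word, neighbors in graph.items()}


def strength(graph):
    """Strength centrality: sum of edge weights incident to each word."""
    return {word: sum(neighbors.values()) for word, neighbors in graph.items()}


def top_k(scores, k=3):
    """Return the k highest-scoring words, breaking ties alphabetically."""
    return sorted(scores, key=lambda w: (-scores[w], w))[:k]


if __name__ == "__main__":
    # Toy input; a real pipeline would also filter stopwords and
    # restrict candidates by part of speech.
    text = "graph centrality ranks words graph centrality ranks phrases"
    g = build_collocation_graph(text.split(), window=1)
    print(top_k(degree(g)))
    print(top_k(strength(g)))
```

Both measures need only a single pass over each node's adjacency list, whereas PageRank requires iterating to convergence, which is one reason such measures are attractive for online systems.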